ABSTRACT

The generalized multivariate median of H. Oja is used to define a
multivariate notion of quantile, or rank, and to define a measure of
scatter of multivariate linear models. The latter, when applied to the
one- and two-sample bivariate location models, yields affine invariant
analogs of the Wilcoxon rank-sum and signed-rank tests, and of the
corresponding estimates.


KEY WORDS: affine invariance, bivariate location model, dispersion
measures, generalized median, multivariate linear models, multivariate
quantile, multivariate rank, permutation tests, R-estimates, Wilcoxon
tests.
1. INTRODUCTION


This paper introduces a notion of multivariate quantile or rank
and uses it to develop affine invariant analogs of rank tests and
R-estimates in the one- and two-sample bivariate location models.

Bickel  (1964)  investigates  the  non-affine  invariant  vectors  of 
medians  and  medians  of  pairwise  averages.  These  are  the  Hodges-Lehmann 
(1963)  R-estimates  derived  from  the  application  of  the  univariate  sign 
and  Wilcoxon  signed  rank  statistics,  respectively,  to  the  components. 

In  comparing  these  estimates  with  the  sample  mean  vector,  Bickel 
concludes  that  despite  encouraging  results  on  robustness  and  efficiency 
there  remains  some  pathological  behavior  of  these  estimates  when  the 
components  of  the  data  vectors  are  highly  correlated.  He  further 
concludes  that  the  bad  behavior  may  be  due  in  part  to  the  lack  of 
affine  invariance  of  these  estimates.  Bickel  (1965)  draws  a  similar 
conclusion for tests based on vectors of univariate rank statistics. A
different robust estimate, the spatial median $\hat\theta$, which minimizes
the sum of distances from $\hat\theta$ to the data vectors, also fails to be
affine invariant; see Gower (1974) and Brown (1983). In addition, there may
be  compelling  reasons  based  on  the  measurement  scales  in  the  model  to 
seek  affine  invariant  rank  methods. 

Oja (1983) defines a generalized median $\hat\theta$ as the value of $\theta$
that minimizes a measure of scatter defined by the sum of areas of triangles
formed by taking $\theta$ along with pairs of data points as vertices. The Oja
generalized  median  is  affine  invariant.  Oja  and  Niinimaa  (1985)  study 
the  efficiency  of  the  generalized  median  and,  in  the  case  of  bivariate 
normality,  show  it  to  be  as  efficient  as  the  spatial  median. 

We introduce a bivariate quantile (or rank) based on the gradient
of Oja's measure of scatter. We use this quantile to develop affine
invariant tests and estimates in the one- and two-sample bivariate
location models. The tests are analogs of the Wilcoxon signed rank
test and the Mann-Whitney-Wilcoxon rank sum test, respectively, and the
estimates are bivariate analogs of R-estimates. Our approach is
similar to that of Jaeckel (1972), as developed in Hettmansperger (1984,
Chapter 5). We first construct a measure of dispersion of residuals in
a  linear  model.  The  dispersion  is  a  linear  function  of  the  bivariate 
residuals  in  which  the  coefficients  depend  on  the  size  or  quantile  of 
the  residuals.  This  dispersion,  which  is  related  to  Oja's  scatter 
measure,  provides  estimates  through  minimization  and  tests  from  its 
gradient  vector. 

The  quantiles  and  generalized  median  are  discussed  in  Section  2 
and  the  dispersion  based  on  quantiles  is  defined  in  Section  3.  The 
two-  and  one-sample  location  models  are  treated  in  Sections  4  and  5, 
respectively,  and  the  statistics  are  illustrated  in  Section  6. 

2.  THE  BIVARIATE  QUANTILE 
Let $x_1, \ldots, x_n, \theta$ be $2 \times 1$ vectors and let

(1)  $T(\theta) = \sum\sum_{i<j} A(x_i, x_j; \theta)$

where $A(x_i, x_j; \theta)$ is the area of the triangle formed with $x_i$,
$x_j$, and $\theta$ as vertices. This is the Oja (1983) measure of scatter.
The value $\hat\theta$ which minimizes $T(\theta)$ is the Oja generalized
median of the bivariate sample.


Given $x^T = (x_1, x_2)$, define $x^{*T} = (-x_2, x_1)$. Then $x^*$ has the
same length as $x$ and is rotated through $\pi/2$ radians in a counterclockwise
direction. It follows that

(2)  $T(\theta) = \tfrac{1}{2} \sum\sum_{i<j} |(x_j - x_i)^{*T}(\theta - x_i)|$.

The quantile vector $Q(\theta)$ is defined by $Q(\theta) = \nabla T(\theta)$,
the vector of partial derivatives of $T(\theta)$. Thus the bivariate quantile
has both magnitude and direction. Quantiles with largest magnitude correspond
to $\theta$ being near or beyond the convex hull of the sample. Those with small
magnitude correspond to points well embedded in the sample. Further,
$-Q(\theta)$ provides the direction of steepest descent on the convex surface
defined by $T(\theta)$ and points towards the mass of the sample. An equivalent
definition of the Oja generalized median $\hat\theta$ is

(3)  $Q(\hat\theta) \doteq 0$,

where "$\doteq 0$" means that $|Q(\theta)|$ is a minimum at $\theta = \hat\theta$.
The equation (3) may determine a single point or a convex set of points from
which the median can be selected; see Oja (1983).

Note that

$(x_j - x_i)^{*T}(\theta - x_i) = \det\begin{pmatrix} x_i & x_j & \theta \\ 1 & 1 & 1 \end{pmatrix}$.

Now, from (2), it is easy to show that

(4)  $Q(\theta) = \tfrac{1}{2} \sum\sum_{i<j} u(x_i, x_j; \theta)$

where

(5)  $u(x_i, x_j; \theta) = \mathrm{sgn}\left\{\det\begin{pmatrix} x_i & x_j & \theta \\ 1 & 1 & 1 \end{pmatrix}\right\} (x_j - x_i)^*$.

The vector $u(x_i, x_j; \theta)$ has magnitude $|x_i - x_j|$ and direction
perpendicular to and away from the chord defined by $(x_i, x_j)$ toward
$\theta$; that is, $u(x_i, x_j; \theta) = x_i^* - x_j^*$ if the order
$x_i, x_j, \theta$ is clockwise, but $x_j^* - x_i^*$ if the order
$x_i, x_j, \theta$ is counterclockwise. Hence, $Q(\theta)$ is $\tfrac{1}{2}$
the sum of "repulsion" vectors $u(x_i, x_j; \theta)$ away from the chords
defined by pairs of points $(x_i, x_j)$ toward $\theta$.
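
The scatter measure (1) and the quantile (4)-(5) translate directly into a few lines of code. The following is a minimal Python sketch and not the authors' program: the function names `det3`, `oja_scatter`, `u` and `quantile` are illustrative, and the convention sgn(0) = 0 is assumed, so that a chord whose line passes through theta contributes nothing.

```python
from itertools import combinations

def sgn(v):
    """Sign function with sgn(0) = 0 (assumed convention)."""
    return (v > 0) - (v < 0)

def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns (a,1), (b,1), (c,1).

    Its absolute value is twice the area of the triangle a, b, c, and it is
    positive when a, b, c are in counterclockwise order.
    """
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def oja_scatter(theta, pts):
    """T(theta): sum of the triangle areas over all pairs of data points."""
    return sum(0.5 * abs(det3(a, b, theta)) for a, b in combinations(pts, 2))

def u(a, b, theta):
    """Repulsion vector: sgn(det) times (b - a) rotated through +90 degrees."""
    s = sgn(det3(a, b, theta))
    dx, dy = b[0] - a[0], b[1] - a[1]
    return (-s * dy, s * dx)

def quantile(theta, pts):
    """Q(theta): half the sum of the repulsion vectors over all pairs."""
    qx = qy = 0.0
    for a, b in combinations(pts, 2):
        ux, uy = u(a, b, theta)
        qx += 0.5 * ux
        qy += 0.5 * uy
    return (qx, qy)

triangle = [(0, 0), (1, 0), (0, 1)]
print(quantile((0.3, 0.2), triangle))  # theta inside the triangle
print(quantile((2, 2), triangle))      # theta outside the triangle
```

For a point inside the triangle the three repulsion vectors cancel, while for a point outside the quantile points away from the sample, so that its negative points back toward the data.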

Using the geometry described in the previous paragraph, or by
algebraic manipulation of (4), it follows that

(6)  $\sum_{i=1}^{n} Q(x_i) = 0$,

so the sample quantiles are centered.
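
The centering property (6) is easy to check numerically. The sketch below is an illustration only: the helper names and the five-point sample are made up, sgn(0) = 0 is assumed, and integer coordinates are used so that the arithmetic is exact.

```python
from itertools import combinations

def det3(a, b, c):
    """Twice the signed area of the triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def quantile(theta, pts):
    """Q(theta) = (1/2) * sum over pairs of sgn(det) * (b - a) rotated by 90 degrees."""
    qx = qy = 0.0
    for a, b in combinations(pts, 2):
        d = det3(a, b, theta)
        s = (d > 0) - (d < 0)          # sgn with sgn(0) = 0
        qx += -0.5 * s * (b[1] - a[1])
        qy += 0.5 * s * (b[0] - a[0])
    return (qx, qy)

def quantile_sum(pts):
    """Sum of the sample quantiles Q(x_i) over all data points."""
    qs = [quantile(p, pts) for p in pts]
    return (sum(q[0] for q in qs), sum(q[1] for q in qs))

sample = [(0, 0), (4, 1), (1, 3), (5, 5), (2, -2)]  # hypothetical data
print(quantile_sum(sample))
```

The cancellation happens triple by triple: for any three data points the three repulsion vectors, each evaluated at the vertex opposite its chord, add to zero.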

In  addition,  the  following  observations  can  be  helpful  in  determining 
quantiles  or  locating  the  generalized  median: 

(i)  When three chords form a triangle, the sum of repulsion vectors is
zero for any $\theta$ within the triangle; see Figure 1. More generally,
if A is a closed, convex loop of chords, the sum of repulsion
vectors is zero for any $\theta$ in A.

(ii)  If $\theta$ is on the line extended indefinitely through two data points,
then the repulsion vector due to those points is zero.

(iii)  If $\theta$ lies outside of a triangle of chords, then the sum of repulsion
vectors due to the chords is twice the repulsion vector of the most
transverse side. See Figure 1. Calculating rules are possible for
other convex polygons, but they are too complicated to be of much
practical value.

The graph of all lines through pairs of data points $(x_i, x_j)$ and
extended indefinitely in both directions breaks the plane into many
polygonal regions. The quantile $Q(\theta)$ changes as $\theta$ passes from one
region to another. The quantile on a border is the average of the quantiles in
the contiguous polygons. A computer algorithm is the most efficient way of
computing quantiles. The following remarks provide some insight into the
computation of quantiles without using a computer.

(a)  To find $Q(\theta)$ use (i) to eliminate as many closed loops of
chords containing $\theta$ as possible. Parts (ii) and (iii) often provide
further reductions. Then $Q(\theta)$ is the sum of the repulsion vectors
due to the remaining chords.

(b)  To locate the generalized median $\hat\theta$ (or median set) delete
successive closed loops using (i) and beginning with the convex hull and
moving inward. When no further reduction is possible, the resulting
configuration of extended chords must be examined to find $\hat\theta$ that
minimizes $|Q(\theta)|$. Generally, all that remains is either one region,
whence all $\theta$ inside are medians, or one region cut by concurrent
diagonals, whence the intersection point of the diagonals is the
median. It is quite possible, however, that $\hat\theta$ is on a border
between polygonal regions. This method is equivalent to deleting
polygons in stages; at each stage, delete all regions with a side
which is part of the current outer boundary. A new boundary
forms at each stage. Stop when there are no further boundaries to
eliminate. See Figure 2.
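
For small samples the generalized median can also be located by brute force. The sketch below is not the peeling method of remark (b). It relies instead on the fact that $T(\theta)$ is convex and piecewise linear with kinks only along the lines through pairs of data points, so a minimizer can be found among the data points and the pairwise intersections of those lines. The names `line_intersection` and `oja_median` are illustrative, and the cost grows roughly like n^6, so this is for illustration only.

```python
from itertools import combinations

def det3(a, b, c):
    """Twice the signed area of the triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def oja_scatter(theta, pts):
    """T(theta): sum of triangle areas over all pairs of data points."""
    return sum(0.5 * abs(det3(a, b, theta)) for a, b in combinations(pts, 2))

def line_intersection(p, q, r, s):
    """Intersection of the line through p, q with the line through r, s.

    Returns None when the two lines are parallel.
    """
    d1 = (q[0] - p[0], q[1] - p[1])
    d2 = (s[0] - r[0], s[1] - r[1])
    den = d1[0] * d2[1] - d1[1] * d2[0]
    if den == 0:
        return None
    t = ((r[0] - p[0]) * d2[1] - (r[1] - p[1]) * d2[0]) / den
    return (p[0] + t * d1[0], p[1] + t * d1[1])

def oja_median(pts):
    """One minimizer of T among data points and chord-line intersections."""
    chords = list(combinations(pts, 2))
    candidates = list(pts)
    for (p, q), (r, s) in combinations(chords, 2):
        x = line_intersection(p, q, r, s)
        if x is not None:
            candidates.append(x)
    return min(candidates, key=lambda c: oja_scatter(c, pts))

square = [(0, 0), (2, 0), (2, 2), (0, 2)]  # hypothetical data
print(oja_median(square))
```

For the four corners of a square the sides form a closed loop and the two diagonals are concurrent, so the search returns their intersection point, in line with remark (b). When the median set is a whole region, this sketch returns only one of its vertices.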

Thus the quantile vector generalizes the idea of a centered rank in a
univariate sample. The magnitudes $|Q(x_i)|$ order the depth of the
observations in the sample, and the directions $-Q(x_i)$ point toward the
center of the data.

In  the  next  section  we  introduce  a  measure  of  dispersion  based  on  the 
quantiles.  We  show  that  it  is  related  to  the  Oja  scatter  measure  (1). 

-  Put  Figures  1  and  2  about  here  - 

3.  DISPERSION 

Jaeckel (1972) derived R-estimates in the linear model from a measure
of dispersion of the residuals. In the univariate linear model, let the
residual $r_i$ be given by $r_i = y_i - \alpha - z_i^T\beta$ where $z_i^T$ is a
$1 \times p$ row vector of known values, $\beta$ is a $p \times 1$ vector of
unknown regression parameters and $\alpha$ is an unknown scalar intercept
parameter. Then an R-estimate of $\beta$ is defined as the vector $\hat\beta$
that minimizes

(7)  $D(\beta) = \sum_{i=1}^{n} [\mathrm{Rank}(y_i - z_i^T\beta) - (n+1)/2](y_i - z_i^T\beta)$.

This dispersion measure is invariant with respect to $\alpha$. Jaeckel showed
that $\hat\beta$ generalizes the notion of an R-estimate from simple location
models to the linear model. McKean and Hettmansperger (1976) developed tests for
$H\beta = 0$, based on (7), where $H$ is a specified $q \times p$ matrix. See
Hettmansperger (1984, Chapter 5) for the details.
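
As a small illustration of (7), the following Python sketch evaluates Jaeckel's dispersion for a given coefficient vector. The function name `jaeckel_dispersion` and the toy data are hypothetical, and ties among residuals are broken arbitrarily by the sort, which is an assumption of this sketch rather than part of the definition.

```python
def jaeckel_dispersion(beta, y, z):
    """D(beta) = sum_i [Rank(e_i) - (n+1)/2] * e_i with e_i = y_i - z_i' beta.

    y is a list of responses and z a list of rows of regression constants.
    """
    n = len(y)
    resid = [yi - sum(b * zij for b, zij in zip(beta, zi)) for yi, zi in zip(y, z)]
    order = sorted(range(n), key=lambda i: resid[i])
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r                      # ranks 1..n of the residuals
    return sum((rank[i] - (n + 1) / 2) * resid[i] for i in range(n))

y = [1.0, 3.0, 2.0, 7.0]                 # hypothetical responses
z = [[0.0], [1.0], [2.0], [3.0]]         # one regressor
print(jaeckel_dispersion([1.0], y, z))
print(jaeckel_dispersion([1.0], [v + 10 for v in y], z))  # shifted intercept
```

Adding a constant to every response leaves the ranks unchanged, and the centered ranks sum to zero, so the two printed values agree. This is the invariance with respect to the intercept noted above.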

The  multivariate  linear  model  can  be  built  by  appending  univariate 
linear  models  in  the  following  fashion: 

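Let $Y$ be an $n \times q$ matrix in which the $n$ rows are independent random
vectors such that

(8)  $EY = Z\beta$,

where $Z$ is an $n \times p$ matrix of given regression constants and $\beta$ is
a $p \times q$ matrix of unknown parameters. If $Y^{(i)}$ and $\beta^{(i)}$ are
the ith columns of $Y$ and $\beta$, respectively, then $EY^{(i)} = Z\beta^{(i)}$
is the univariate linear model described in the previous paragraph.

Let $r_i = Y_i - \beta^T z_i$ denote the ith $q \times 1$ residual vector, where
$Y_i^T$ and $z_i^T$ are the ith rows of $Y$ and $Z$. With the centered rank
replaced by the quantile $Q(r_i)$ of $r_i$ relative to $r_1, \ldots, r_n$, (7)
becomes

(9)  $D(\beta) = \sum_{i} Q^T(r_i) r_i$.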

In the following sections we will consider the special cases of the two-
and one-sample bivariate location models. First, however, we will show
that in the bivariate case ($q = 2$), $D(\beta)$ given by (9) is related to
Oja's scatter measure (1).


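Theorem.

(10)  $D(\beta) = 2 \sum\sum\sum_{i<j<k} A(r_i, r_j, r_k)$.

Proof. From (4) and (9),

$D(\beta) = \sum_i Q^T(r_i) r_i = \tfrac{1}{2} \sum_i \sum\sum_{j<k} u^T(r_j, r_k; r_i) r_i$.

Each triple $i<j<k$ contributes three terms to this sum, one for each vertex
at which the quantile is evaluated. As in the derivation of (6),

$u(r_j, r_k; r_i) + u(r_i, r_k; r_j) + u(r_i, r_j; r_k) = 0$,

so $r_j$ may be subtracted from each of the three vertices, and the
contribution of the triple is

$u^T(r_j, r_k; r_i)(r_i - r_j) + u^T(r_i, r_j; r_k)(r_k - r_j)$.

By (5),

$u^T(r_j, r_k; r_i)(r_i - r_j) = \mathrm{sgn}\left\{\det\begin{pmatrix} r_j & r_k & r_i \\ 1 & 1 & 1 \end{pmatrix}\right\} (r_k - r_j)^{*T}(r_i - r_j) = 2A(r_j, r_k, r_i)$,

and the second term equals $2A(r_i, r_j, r_k)$ in the same way. Hence each
triple contributes four times its area, and

$D(\beta) = \tfrac{1}{2} \cdot 4 \sum\sum\sum_{i<j<k} A(r_i, r_j, r_k) = 2 \sum\sum\sum_{i<j<k} A(r_i, r_j, r_k)$.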


Thus, our dispersion measure is proportional to the sum of areas
of triangles with residuals at the vertices. The scale invariant
R-like estimate is the matrix $\hat\beta$ that minimizes this sum of triangular
areas.
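
The proportionality between the dispersion (9) and the sum of triangle areas can be checked numerically. In the sketch below the helper names and the residual vectors are hypothetical, sgn(0) = 0 is assumed, and with the quantile normalized as in (4) the ratio of the two sums comes out as 2.

```python
from itertools import combinations

def det3(a, b, c):
    """Twice the signed area of the triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def quantile(theta, pts):
    """Q(theta) = (1/2) * sum over pairs of the repulsion vectors."""
    qx = qy = 0.0
    for a, b in combinations(pts, 2):
        d = det3(a, b, theta)
        s = (d > 0) - (d < 0)
        qx += -0.5 * s * (b[1] - a[1])
        qy += 0.5 * s * (b[0] - a[0])
    return (qx, qy)

def dispersion(res):
    """D = sum_i Q(r_i)' r_i, with Q computed relative to the residuals."""
    total = 0.0
    for r in res:
        q = quantile(r, res)
        total += q[0] * r[0] + q[1] * r[1]
    return total

def triangle_area_sum(res):
    """Sum of the areas of all triangles with residuals at the vertices."""
    return sum(0.5 * abs(det3(a, b, c)) for a, b, c in combinations(res, 3))

residuals = [(0, 0), (4, 1), (1, 3), (5, 5), (2, -2)]  # hypothetical
print(dispersion(residuals), triangle_area_sum(residuals))
```

Because the sample quantiles are centered, `dispersion` is unchanged when the same vector is added to every residual, just as the triangle areas are.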


4.  THE  TWO  SAMPLE  LOCATION  MODEL 

In the bivariate two-sample location model the matrix $\beta$ in (8) can
be replaced by a vector. The intercept part of the linear model does
not affect the difference in locations. Accordingly, in the bivariate
two-sample problem there are $n_1$ observations $x_1, \ldots, x_{n_1}$ and $n_2$
observations $y_1, \ldots, y_{n_2}$. Let $\Delta$ be the location shift vector
applied to the y-sample, so that the residuals are either $x_i$ or
$y_j - \Delta$. Then the terms of $D$ are areas of triangles whose vertices
number $s$ from the x-sample and $3-s$ from the y-sample, for $s = 0, 1, 2, 3$.
The next result shows that, similar to the univariate case, the dispersion
depends on the y-x differences.

Theorem.

(11)  $D(\Delta) = 2\left\{\text{constant} + \sum_k \sum\sum_{i<j} A(y_k - x_i, y_k - x_j, \Delta) + \sum_i \sum\sum_{j<k} A(y_j - x_i, y_k - x_i, \Delta)\right\}$.

Proof. Apply $D$ in (10) to the combined samples. Note that areas are
not affected by the same displacement applied to all three vertices or
by sign changes. Hence, areas that involve three x's or three y's do
not depend on $\Delta$. We now have

$D(\Delta) = 2\left\{\text{constant} + \sum_k \sum\sum_{i<j} A(x_i, x_j, y_k - \Delta) + \sum_i \sum\sum_{j<k} A(y_j - \Delta, y_k - \Delta, x_i)\right\}$,

but $A(x_i, x_j, y_k - \Delta) = A(y_k - x_i, y_k - x_j, \Delta)$ and
$A(y_j - \Delta, y_k - \Delta, x_i) = A(y_j - x_i, y_k - x_i, \Delta)$. This
completes the argument.

Up to a constant factor, the gradient of $D(\Delta)$ is given by

(12)  $S(\Delta) = \tfrac{1}{2} \sum_k \sum\sum_{i<j} u(y_k - x_i, y_k - x_j; \Delta) + \tfrac{1}{2} \sum_i \sum\sum_{j<k} u(y_j - x_i, y_k - x_i; \Delta)$.

It is sufficient to consider testing the null hypothesis $H_0: \Delta = 0$
against either a general alternative $H_1: \Delta \neq 0$ or some directional
alternative stating that $\Delta$ differs from 0 in some fixed direction.
A test based on the quantiles uses $S = S(0)$, that is, (12)
evaluated at the hypothesized value.

The next result shows that, like the univariate rank sum
statistic, $S$ reduces to the sum of the x-quantiles in the combined
sample.

Theorem.

(13)  $S = \sum_{i=1}^{n_1} Q(x_i)$.

Proof. From (12) we have

$2S = \sum_k \sum\sum_{i<j} u(-x_i, -x_j; -y_k) + \sum_i \sum\sum_{j<k} u(y_j, y_k; x_i)$

$\;\;\;\; = \sum_i \sum\sum_{j<k} u(y_j, y_k; x_i) - \sum_k \sum\sum_{i<j} u(x_i, x_j; y_k)$.

Note that $u(x_i, x_j; y_k) + u(x_j, y_k; x_i) + u(y_k, x_i; x_j) = 0$ and recall
from (6) that $\sum_i \sum\sum_{j<k} u(x_j, x_k; x_i) = 0$. With these facts, $2S$
reduces to

$\sum_i \sum\sum_{j<k} u(y_j, y_k; x_i) + \sum_i \sum_{j \neq i} \sum_k u(x_j, y_k; x_i) + \sum_i \sum\sum_{j<k} u(x_j, x_k; x_i)$

which, when compared to (4), is seen to be the result stated in the
theorem.
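
The identity (13) can be confirmed numerically by computing $S$ both ways: from the cross-sample differences as in (12) with $\Delta = 0$, and as the sum of the x-quantiles in the combined sample. The following Python sketch uses made-up integer data, and the helper names are illustrative.

```python
from itertools import combinations

def det3(a, b, c):
    """Twice the signed area of the triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def u(a, b, theta):
    """Repulsion vector sgn(det) * (b - a) rotated by +90 degrees; sgn(0) = 0."""
    d = det3(a, b, theta)
    s = (d > 0) - (d < 0)
    return (-s * (b[1] - a[1]), s * (b[0] - a[0]))

def quantile(theta, pts):
    """Q(theta): half the sum of repulsion vectors over all pairs in pts."""
    us = [u(a, b, theta) for a, b in combinations(pts, 2)]
    return (0.5 * sum(v[0] for v in us), 0.5 * sum(v[1] for v in us))

def diff(p, q):
    return (p[0] - q[0], p[1] - q[1])

def s_from_differences(xs, ys):
    """S(0) computed from the y - x differences, as in (12) with Delta = 0."""
    origin = (0, 0)
    us = [u(diff(yk, xi), diff(yk, xj), origin)
          for yk in ys for xi, xj in combinations(xs, 2)]
    us += [u(diff(yj, xi), diff(yk, xi), origin)
           for xi in xs for yj, yk in combinations(ys, 2)]
    return (0.5 * sum(v[0] for v in us), 0.5 * sum(v[1] for v in us))

def s_from_quantiles(xs, ys):
    """Sum of the quantiles of the x's within the combined sample."""
    combined = list(xs) + list(ys)
    qs = [quantile(x, combined) for x in xs]
    return (sum(q[0] for q in qs), sum(q[1] for q in qs))

xs = [(0, 0), (3, 1), (1, 4)]            # hypothetical x-sample
ys = [(2, 6), (6, 5), (4, 8), (7, 9)]    # hypothetical y-sample
print(s_from_differences(xs, ys), s_from_quantiles(xs, ys))
```

The second form is the cheaper one in practice, since it needs only the quantiles of the $n_1$ points of the x-sample within the combined sample.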

The test statistic is a clear analog of the Mann-Whitney-Wilcoxon
rank sum statistic in the univariate case. The set
$x_1 + \Delta, \ldots, x_{n_1} + \Delta$ is one of $\binom{n_1+n_2}{n_1}$
equally likely subsamples of $n_1$ bivariate observations
drawn from the combined $x + \Delta$ and $y$ set of size $n_1 + n_2$. Hence,
standard permutation arguments show that $S(\Delta) = \sum Q(x_i + \Delta)$ has
expectation 0. The natural estimate $\hat\Delta$ is the value of $\Delta$ such
that $S(\Delta) \doteq 0$ in the sense of (3).
This is the analog of the Hodges-Lehmann (1963) estimate of shift in the
univariate two sample problem. Equation (12) shows that $\hat\Delta$ is an Oja
generalized median computed on the cross sample differences.

Under the null hypothesis $H_0: \Delta = 0$, the permutation argument
shows that $ES = 0$ and the covariance matrix of $S$ is

(14)  $C = \frac{n_1 n_2}{(n_1+n_2)(n_1+n_2-1)} \sum_{i=1}^{n_1+n_2} Q(z_i) Q^T(z_i)$

where $z_1, \ldots, z_{n_1+n_2}$ represents the combined sample. Furthermore, $S$
will be approximately bivariate normal for large $n_1, n_2$.

An approximately size $\alpha$ test for $H_0: \Delta = 0$ against
$H_1: \Delta \neq 0$ rejects $H_0$ if $S^T C^{-1} S > \chi^2_\alpha(2)$ where
$\chi^2_\alpha(2)$ is the $100(1-\alpha)$th percentile from
a chi-square distribution with 2 degrees of freedom. To test $H_0: \Delta = 0$
against an alternative specifying a fixed direction with unit vector $v$,
project on $v$ yielding $v^T S$ with null variance $v^T C v$. We reject $H_0$
if $v^T S/(v^T C v)^{1/2} > z_\alpha$ where $z_\alpha$ is the $100(1-\alpha)$th
percentile from the standard normal distribution.
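
The two-sample test can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' program: the name `rank_sum_test` and the data are hypothetical, the covariance is taken to be the exact permutation covariance $n_1 n_2/\{N(N-1)\} \sum Q(z_i)Q^T(z_i)$ with $N = n_1 + n_2$, and the p-value uses the fact that the upper tail of a chi-square distribution with 2 degrees of freedom is $\exp(-x/2)$.

```python
import math
from itertools import combinations

def det3(a, b, c):
    """Twice the signed area of the triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def quantile(theta, pts):
    """Q(theta) = (1/2) * sum over pairs of the repulsion vectors."""
    qx = qy = 0.0
    for a, b in combinations(pts, 2):
        d = det3(a, b, theta)
        s = (d > 0) - (d < 0)
        qx += -0.5 * s * (b[1] - a[1])
        qy += 0.5 * s * (b[0] - a[0])
    return (qx, qy)

def rank_sum_test(xs, ys):
    """Return (S, S' C^{-1} S, approximate p-value) for H0: no location shift."""
    z = list(xs) + list(ys)
    n1, n2 = len(xs), len(ys)
    n = n1 + n2
    q = [quantile(p, z) for p in z]
    # S is the sum of the quantiles of the x-sample in the combined sample.
    s = (sum(q[i][0] for i in range(n1)), sum(q[i][1] for i in range(n1)))
    k = n1 * n2 / (n * (n - 1))          # assumed permutation-covariance factor
    c11 = k * sum(a * a for a, _ in q)
    c12 = k * sum(a * b for a, b in q)
    c22 = k * sum(b * b for _, b in q)
    det = c11 * c22 - c12 * c12
    # Quadratic form S' C^{-1} S written out for the 2x2 case.
    stat = (c22 * s[0] ** 2 - 2 * c12 * s[0] * s[1] + c11 * s[1] ** 2) / det
    return s, stat, math.exp(-stat / 2)  # chi-square(2) upper tail

xs = [(0, 0), (3, 1), (1, 4), (5, 2)]    # hypothetical x-sample
ys = [(2, 6), (6, 5), (4, 8), (7, 9)]    # hypothetical y-sample
print(rank_sum_test(xs, ys))
```

Applying the same nonsingular affine map to every observation in both samples leaves the statistic unchanged, which is the affine invariance the method is designed to have. The chi-square approximation is a large-sample one and should not be trusted for samples as small as this toy example.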

5.  THE  ONE  SAMPLE  LOCATION  MODEL 

Suppose $x_1, \ldots, x_n$ is a sample from a bivariate distribution with the
property that $x - \theta$ and $\theta - x$ have the same distribution. Then
$\theta$ represents the center of the distribution.

In testing $H_0: \theta = 0$, it is difficult to develop a simple sign-test
analog based on the Oja generalized median $\hat\theta$ of (3). The natural test
function is $Q = Q(0) = \tfrac{1}{2} \sum\sum_{i<j} u(x_i, x_j; 0)$, but a
simple randomization argument
does not provide the null distribution. The main goal of this section is
to develop an analog of the Wilcoxon signed rank test.

A standard device for producing one-sample univariate rank methods
from two-sample procedures is to create a second, artificial sample that
consists of the negatives of the original sample. When we consider the
univariate rank of $-x_i$, say, relative to $x_1, \ldots, x_n$ the result is the
number of sums $x_i + x_j$, $j = 1, \ldots, n$ (or averages $(x_i + x_j)/2$) that
are negative.
By doing this for each data point, we find that the two-sample rank
method produces counts of the signs of the pairwise sums or averages,
and these counts are related to the ranks of the absolute values; see
Hettmansperger (1984, Section 2.3). Hence, we arrive at the one sample
signed rank statistics. This device, in the bivariate case, allows us
to avoid the problem of introducing absolute values in the plane.

Returning to the bivariate case, let $-x_1, \ldots, -x_n$ be the second,
artificial sample. For inference on $\theta$, following the discussion in the
previous section, let $\Delta = 2\theta$ and consider

(15)  $S(\Delta) = \sum_{i=1}^{n} Q_{2n}(-x_i + \Delta)$

where the subscript $2n$ on $Q$ indicates that the quantile of $-x_i + \Delta$ is
computed relative to $-x_1 + \Delta, \ldots, -x_n + \Delta, x_1, \ldots, x_n$. The
next result shows that we need only consider $Q_n(-x_i + \Delta)$; that is, the
quantile of $-x_i + \Delta$ relative to $x_1, \ldots, x_n$.

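Theorem.

(16)  $S(\Delta) = 2 \sum_{i=1}^{n} Q_n(-x_i + \Delta)$.

Proof. Let $\Delta = 0$ without loss of generality. Now

$4Q_{2n}(-x_i) = \sum_j \sum_k u(x_j, x_k; -x_i) + \sum_j \sum_k u(-x_j, x_k; -x_i) + \sum_j \sum_k u(x_j, -x_k; -x_i) + \sum_j \sum_k u(-x_j, -x_k; -x_i)$

$\;\;\;\; = A_i + B_i + C_i$,

where $A_i$ is the first double sum, $B_i$ is the sum of the two middle double
sums, and $C_i$ is the last. Summing on $i$, $\sum_i C_i = 0$ by (6). Since

$u(a, b; c) = -u(b, c; a) - u(a, c; b)$ and $u(a, b; c) = -u(-a, -b; -c)$,

$\sum_i B_i = \sum_i \sum_j \sum_k \{-u(x_k, -x_i; -x_j) - u(-x_j, -x_i; x_k) - u(-x_k, -x_i; x_j) - u(x_j, -x_i; -x_k)\}$

$\;\;\;\; = \sum_i \sum_j \sum_k \{-u(x_k, -x_i; -x_j) + u(x_j, x_i; -x_k) + u(x_k, x_i; -x_j) - u(x_j, -x_i; -x_k)\}$.

Group the first with the fourth terms and the second with the third
terms to get $\sum_i B_i = -\sum_i B_i + 2\sum_i A_i$. Hence,
$\sum_i B_i = \sum_i A_i$ and

$4 \sum_i Q_{2n}(-x_i) = 2 \sum_i A_i = 2 \cdot 2 \sum_i \sum\sum_{j<k} u(x_j, x_k; -x_i) = 8 \sum_i \tfrac{1}{2} \sum\sum_{j<k} u(x_j, x_k; -x_i)$

which reduces with an application of (4) to the result stated in the
theorem.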
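
The reduction in (16) can be checked numerically at $\Delta = 0$: the quantiles of the reflected points within the combined sample of size $2n$ should add up to twice the sum of their quantiles relative to the original sample alone. The Python sketch below uses made-up integer data so that the comparison is exact; the helper names are illustrative.

```python
from itertools import combinations

def det3(a, b, c):
    """Twice the signed area of the triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def quantile(theta, pts):
    """Q(theta) = (1/2) * sum over pairs of the repulsion vectors."""
    qx = qy = 0.0
    for a, b in combinations(pts, 2):
        d = det3(a, b, theta)
        s = (d > 0) - (d < 0)            # sgn with sgn(0) = 0
        qx += -0.5 * s * (b[1] - a[1])
        qy += 0.5 * s * (b[0] - a[0])
    return (qx, qy)

def vec_sum(vs):
    return (sum(v[0] for v in vs), sum(v[1] for v in vs))

def sum_q2n(xs):
    """Sum over i of Q_2n(-x_i): quantiles within the reflected plus original sample."""
    neg = [(-a, -b) for a, b in xs]
    combined = neg + list(xs)
    return vec_sum([quantile(p, combined) for p in neg])

def sum_qn(xs):
    """Sum over i of Q_n(-x_i): quantiles relative to the original sample only."""
    neg = [(-a, -b) for a, b in xs]
    return vec_sum([quantile(p, xs) for p in neg])

xs = [(1, 0), (0, 1), (3, 2), (-2, 5), (4, -1)]  # hypothetical sample
lhs, rhs = sum_q2n(xs), sum_qn(xs)
print(lhs, (2 * rhs[0], 2 * rhs[1]))
```

For a general $\Delta$ the same check applies after replacing each reflected point $-x_i$ by $-x_i + \Delta$.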

This theorem shows that the estimate of location $\hat\theta = \hat\Delta/2$ is
defined by $S(\hat\Delta) \doteq 0$. A further interpretation is possible. Note
that

$Q_n(-x_i + \Delta) = \tfrac{1}{2} \sum\sum_{j<k} u(x_j, x_k; -x_i + \Delta) = \tfrac{1}{2} \sum\sum_{j<k} u(x_i + x_j, x_i + x_k; \Delta)$.

Hence, by (16),

$S(\Delta) = \sum_i \sum\sum_{j<k} u(x_i + x_j, x_i + x_k; \Delta)$,

or, in terms of $\theta = \Delta/2$,

(17)  $S(2\theta) = 2 \sum_i \sum\sum_{j<k} u\left(\frac{x_i + x_j}{2}, \frac{x_i + x_k}{2}; \theta\right)$.

Analogous to the univariate Hodges-Lehmann estimate which is the
median of the pairwise averages, the bivariate estimate $\hat\theta$ is an Oja
generalized median computed on the coupled pairs displayed in (17). In
effect, (17) shows how the data can be symmetrized before the median
operation is applied.

For testing the hypothesis $H_0: \theta = 0$ we could use either
$\sum Q_{2n}(-x_i)$ or $\sum Q_n(-x_i)$. Under $H_0: \theta = 0$ the randomization
distribution of $\sum Q_{2n}(-x_i)$ is
easier to work with. The first line of the proof of (16) shows that
$Q_{2n}(-x_i) = -Q_{2n}(x_i)$. Under the null hypothesis each has probability
1/2 so
that the statistic $S = \sum Q_{2n}(-x_i)$ has $ES = 0$ and covariance matrix

$C = \sum_i Q_{2n}(x_i) Q_{2n}^T(x_i)$.

Tests of $H_0: \theta = 0$ are carried out as described in the last paragraph of
the last section. The test statistic is a scale invariant bivariate analog
of the Wilcoxon signed rank statistic.
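
A minimal Python sketch of the one-sample test follows. It is an illustration, not the authors' program: the name `signed_rank_test` and the data are hypothetical, the hypothesized center is taken to be the origin (subtract it from the data first otherwise), and the covariance matrix is the sum of outer products of the quantiles $Q_{2n}(x_i)$ as described above.

```python
import math
from itertools import combinations

def det3(a, b, c):
    """Twice the signed area of the triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def quantile(theta, pts):
    """Q(theta) = (1/2) * sum over pairs of the repulsion vectors."""
    qx = qy = 0.0
    for a, b in combinations(pts, 2):
        d = det3(a, b, theta)
        s = (d > 0) - (d < 0)
        qx += -0.5 * s * (b[1] - a[1])
        qy += 0.5 * s * (b[0] - a[0])
    return (qx, qy)

def signed_rank_test(xs):
    """Return (S, S' C^{-1} S, approximate p-value) for H0: center at the origin."""
    neg = [(-a, -b) for a, b in xs]
    combined = neg + list(xs)
    q = [quantile(p, combined) for p in neg]   # Q_2n(-x_i)
    s = (sum(v[0] for v in q), sum(v[1] for v in q))
    # Sign flips of each Q_2n(x_i) are equally likely under H0, so the
    # covariance is the sum of the outer products.
    c11 = sum(a * a for a, _ in q)
    c12 = sum(a * b for a, b in q)
    c22 = sum(b * b for _, b in q)
    det = c11 * c22 - c12 * c12
    stat = (c22 * s[0] ** 2 - 2 * c12 * s[0] * s[1] + c11 * s[1] ** 2) / det
    return s, stat, math.exp(-stat / 2)        # chi-square(2) upper tail

xs = [(1, 0), (0, 1), (3, 2), (-2, 5), (4, -1)]  # hypothetical sample
print(signed_rank_test(xs))
```

The statistic is unchanged when every observation is multiplied by the same nonsingular matrix. As in the two-sample case, the chi-square reference distribution is a large-sample approximation.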

6.  EXAMPLES 

In this section the two-sample and one-sample bivariate rank methods
are illustrated through application to two data sets. First consider the
two sample test. The following data from Hettmansperger (1984, p.291)
consist of levels of two biochemical components in brains of mice,
$x^T = (x_1, x_2)$.


Control group

x1:  1.21  0.92  0.80  0.85  0.98  1.15  1.10  1.02  1.18  1.09
x2:  0.61  0.43  0.35  0.48  0.42  0.52  0.50  0.53  0.45  0.40

Treatment group

y1:  1.40  1.17  1.23  1.19  1.38  1.17  1.31  1.30  1.22  1.00  1.12  1.09
y2:  0.50  0.39  0.44  0.37  0.42  0.45  0.41  0.47  0.29  0.30  0.27  0.35

The corresponding quantiles of the control group observations
among the combined control plus treatment sample are

x1:    .89  -9.01  -8.82  -9.40  -6.78   -.51  -3.25  -6.24   2.28  -3.04
x2:  18.53   3.65  -6.26   9.37    .76  15.70  13.63  15.85   5.27  -4.56

and the sum of quantiles is $S^T = (-43.88, 71.94)$. The standardized
test of no location shift between control and treatment populations is
$S^T C^{-1} S = 15.137$, which is highly significant when referred to
$\chi^2_2$. By comparison, a univariate rank method applied componentwise yields
$\chi^2_2 = 14.22$; see Hettmansperger (1984, p.292).

For  a  one-sample  test,  the  following  data  from  Hettmansperger  (1984, 
p.286)  are  systolic  and  diastolic  blood  pressures  of  15  adult  male  Peruvian 
Indians. 
x1:  170  125  148  140  106  108  124  134  116  114  118  138  134  124  114
x2:   76   75  120   78   72   62   70   64   76   74   68   78   86   64   66

To test that the center of the bivariate population is (120, 80), consider
the sample of $(y_1, y_2) = (x_1 - 120, x_2 - 80)$ values and the reflected
artificial sample of values $(-y_1, -y_2)$. The quantiles of the reflected
sample among the combined sample are

y1:  -3271  -1441  -1205  -2814  2064  944  -1346  -2241  596
     1052.5  -410  -2483  -2298  -1384.5  220
y2:  1145  1579  -3184  1190  1321  2858  2522  3276  602
     1088.5  2822  1238  -552  3634.5  2911

and the quantile sum is $S^T = (-14017, 22469)$. The standardized statistic
$S^T C^{-1} S = 8.1*9$, highly significant when referred to $\chi^2_2$. The
corresponding $\chi^2_2$ for the componentwise univariate rank method has the
same value; see Hettmansperger (1984, p.287).
REPORT DOCUMENTATION PAGE

Title: Affine Invariant Rank Methods in the Bivariate Location Model
Authors: B. M. Brown, University of Tasmania;
Thomas P. Hettmansperger, The Pennsylvania State University
Performing organization: Department of Statistics,
The Pennsylvania State University, University Park, PA
Controlling office: Office of Naval Research,
Statistical and Probability Program Code 436, Arlington, VA 22217
Contract or grant number: N00014-80-C-0741
Program element, project, task area and work unit numbers: NR042-446
Report date: September 1985
Number of pages: 17
Security classification: Unclassified
Distribution statement: Approved for public release; distribution unlimited.


